inactive node
Multi-Layer Feature Reduction for Tree Structured Group Lasso via Hierarchical Projection
Wang, Jie
Tree structured group Lasso (TGL) is a powerful technique for uncovering tree structured sparsity over features, where each node encodes a group of features. It has been applied successfully in many real-world applications. However, with extremely large feature dimensions, solving TGL remains a significant challenge due to its highly complicated regularizer. In this paper, we propose a novel Multi-Layer Feature reduction method (MLFre) to quickly identify the inactive nodes (the groups of features with zero coefficients in the solution) hierarchically in a top-down fashion; these nodes are guaranteed to be irrelevant to the response. Thus, we can remove the detected nodes from the optimization without sacrificing accuracy. The major challenge in developing such testing rules lies in the overlaps between parent and child nodes. Via a novel hierarchical projection algorithm, MLFre is able to test the nodes independently of any of their ancestor nodes. Moreover, we can integrate MLFre, which has a low computational cost, with any existing solvers. Experiments on both synthetic and real data sets demonstrate that the speedup gained by MLFre can be orders of magnitude.
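The regularizer in the abstract above can be made concrete: TGL penalizes a weighted sum of group-wise l2 norms over the nodes of a feature tree, and a node whose group of coefficients is entirely zero is exactly an "inactive node". A minimal sketch, where the toy tree and node weights are illustrative assumptions rather than the paper's setup:

```python
import numpy as np

# A toy feature tree: each node is a group of feature indices, and a
# parent contains the union of its children (the overlap the paper
# highlights). The node weights are hypothetical.
tree = [
    {"features": [0, 1, 2, 3], "weight": 1.0},   # root
    {"features": [0, 1],       "weight": 0.5},   # child 1
    {"features": [2, 3],       "weight": 0.5},   # child 2
]

def tgl_penalty(beta, tree):
    """TGL regularizer: weighted sum of group-wise l2 norms over the
    tree's nodes."""
    return sum(n["weight"] * np.linalg.norm(beta[n["features"]])
               for n in tree)

# child 1's coefficients are all zero, so it is an inactive node that a
# screening rule like MLFre could discard before solving.
beta = np.array([0.0, 0.0, 1.0, -2.0])
penalty = tgl_penalty(beta, tree)
```

Screening away child 1 shrinks the optimization without changing the solution, which is the accuracy guarantee the abstract refers to.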
Causal Incremental Graph Convolution for Recommender System Retraining
Ding, Sihao, Feng, Fuli, He, Xiangnan, Liao, Yong, Shi, Jun, Zhang, Yongdong
Real-world recommender systems need to be retrained regularly to keep up with new data. In this work, we consider how to efficiently retrain graph convolutional network (GCN) based recommender models, which are state-of-the-art techniques for collaborative recommendation. To pursue high efficiency, we set the target as using only new data for model updating, while not sacrificing recommendation accuracy compared with full model retraining. This is non-trivial to achieve, since the interaction data participates in both the graph structure for model construction and the loss function for model learning, whereas the old graph structure cannot be used in model updating. Towards this goal, we propose a \textit{Causal Incremental Graph Convolution} approach, which consists of two new operators named \textit{Incremental Graph Convolution} (IGC) and \textit{Colliding Effect Distillation} (CED) to estimate the output of full graph convolution. In particular, we devise simple and effective modules for IGC that ingeniously combine the old representations with the incremental graph and effectively fuse the long-term and short-term preference signals. CED aims to avoid the out-of-date issue for inactive nodes that are not in the incremental graph, by connecting the new data with inactive nodes through causal inference. Specifically, CED estimates the causal effect of new data on the representation of inactive nodes through the control of their collider. Extensive experiments on three real-world datasets demonstrate both accuracy gains and significant speed-ups over the existing retraining mechanism.
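To make the "only new data for model updating" idea concrete, here is a minimal sketch of incremental neighbour aggregation: running neighbour sums and degrees are folded together with just the incremental edges, with no pass over the old graph structure. This is a simple mean-aggregation illustration of the idea, not the paper's IGC operator; the graphs and features below are made up, and node representations are assumed fixed.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((5, 4))          # fixed node representations

old_edges = [(0, 1), (1, 2), (2, 3)]     # old graph (already aggregated)
new_edges = [(3, 4), (0, 4)]             # the incremental graph

# running neighbour sums and degrees accumulated from the old graph
s = np.zeros_like(X)
d = np.zeros(5, dtype=int)
for u, v in old_edges:
    s[u] += X[v]; d[u] += 1
    s[v] += X[u]; d[v] += 1

def incremental_update(s, d, edges, X):
    """Fold only the new edges into the running sums/degrees, then
    re-normalize: the old graph structure is never revisited."""
    s, d = s.copy(), d.copy()
    for u, v in edges:
        s[u] += X[v]; d[u] += 1
        s[v] += X[u]; d[v] += 1
    return s / np.maximum(d, 1)[:, None], s, d

H, s, d = incremental_update(s, d, new_edges, X)   # mean aggregation
```

Under the fixed-representation assumption, this incremental update matches a full re-aggregation over all edges; the paper's CED operator additionally handles inactive nodes (here, any node untouched by `new_edges`) whose representations would otherwise go stale.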
Guided Dropout
Keshari, Rohit, Singh, Richa, Vatsa, Mayank
Dropout is often used in deep neural networks to prevent over-fitting. Conventionally, dropout training invokes a \textit{random drop} of nodes from the hidden layers of a neural network. Our hypothesis is that a guided selection of nodes for intelligent dropout can lead to better generalization than traditional dropout. In this research, we propose "guided dropout" for training deep neural networks, which drops nodes by measuring the strength of each node. We also demonstrate that conventional dropout is a specific case of the proposed guided dropout. Experimental evaluation on multiple datasets, including MNIST, CIFAR10, CIFAR100, SVHN, and Tiny ImageNet, demonstrates the efficacy of the proposed guided dropout.
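A minimal sketch of the guided-selection idea: drop the k weakest hidden nodes according to a per-node strength score instead of a random subset. The strength score used here (an arbitrary non-negative vector) is an assumption for illustration, not the paper's actual strength measure; conventional dropout is recovered by using random strengths.

```python
import numpy as np

rng = np.random.default_rng(0)

def guided_dropout(activations, strength, drop_rate=0.5):
    """Zero out the weakest nodes by strength, then rescale the
    survivors (inverted-dropout scaling)."""
    n = activations.shape[-1]
    k = int(round(drop_rate * n))          # number of nodes to drop
    mask = np.ones(n)
    mask[np.argsort(strength)[:k]] = 0.0   # drop the k weakest nodes
    return activations * mask / (1.0 - drop_rate)

h = rng.standard_normal((4, 8))            # a batch of hidden activations
s = np.abs(rng.standard_normal(8))         # hypothetical node strengths
out = guided_dropout(h, s, drop_rate=0.25)
```

With `drop_rate=0.25` on 8 nodes, the 2 lowest-strength columns are zeroed every step, while random dropout would zero a different random pair each time.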
Topological Recurrent Neural Network for Diffusion Prediction
Wang, Jia, Zheng, Vincent W., Liu, Zemin, Chang, Kevin Chen-Chuan
Information diffusion is a common phenomenon on social networks [1], [2]. Modeling it has many applications, such as helping to predict which user is an opinion leader [3], how much a cascade will grow [4], who the diffusion sources are [5], which user will digg a particular story [6], and so on. In this paper, we study the task of information diffusion prediction. The goal is to design an effective diffusion model that can estimate the activation probability for an inactive node in a cascade. We consider the most standard setting of information diffusion, where the inputs are: 1) a data graph G = (V, E), where V is the set of nodes and E is the set of edges; 2) a set of cascade sequences, each of which is an ordered sequence of node activations over V. For example, in Figure 1, the data graph G is a network of seven nodes; a cascade sequence A → B → C → D is a sequence of nodes ordered by their activation time stamps. Early work assumes the diffusion model is given, such as independent cascade (IC) and linear threshold (LT) [3]. There are many extensions of the IC and LT models, such as continuous-time IC [7].
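The independent cascade (IC) model mentioned above is easy to state in code: each newly activated node gets a single chance to activate each still-inactive neighbour with some probability. A minimal Monte-Carlo sketch, assuming a uniform activation probability p (edge-specific probabilities are also common) on a made-up seven-node graph echoing the abstract's example:

```python
import random

def independent_cascade(graph, seeds, p=0.1, rng=None):
    """One Monte-Carlo run of the IC model: newly activated nodes try
    once to activate each inactive out-neighbour with probability p.
    Returns the final set of active nodes."""
    rng = rng or random.Random(42)
    active = set(seeds)
    frontier = list(seeds)
    while frontier:
        nxt = []
        for u in frontier:
            for v in graph.get(u, []):
                if v not in active and rng.random() < p:
                    active.add(v)
                    nxt.append(v)
        frontier = nxt
    return active

# a toy seven-node data graph (adjacency lists, all hypothetical)
G = {"A": ["B", "E"], "B": ["C", "F"], "C": ["D"],
     "E": ["F"], "F": ["G"]}
cascade = independent_cascade(G, seeds=["A"], p=0.9)
```

Averaging many such runs estimates the activation probability of each inactive node, which is exactly the quantity the learned diffusion model in the paper is meant to predict directly.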